fix(ci): trigger Algolia crawler reindex via API #428
Conversation
- Purpose: replace the previous Algolia action-based workflow with the exact crawler reindex flow observed in the Algolia dashboard HAR, and document the required repo setup.
- Before: the workflow depended on an extra standard Algolia API key and a third-party action instead of directly calling the crawler API used by the dashboard.
- Problem: that setup asked for unnecessary credentials, hid the real reindex mechanism, and was harder to reason about when debugging or rotating secrets.
- Now: the workflow resolves the crawler by name, triggers the crawler reindex endpoint with crawler-specific credentials only, waits for deployment propagation on pushes to main, and verifies the crawler enters the reindexing state.
- How: add a push trigger for published docs changes, use curl + jq against crawler.algolia.com, keep the manual and reusable entry points, and document the exact secrets and optional variables in the README.
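The entry points and concurrency behavior described above could look roughly like the following workflow fragment. This is a hypothetical sketch, not the file from this PR: the docs path filter, secret names, and helper script path are all assumptions.

```yaml
# Hypothetical sketch of the workflow's entry points — NOT the exact file from
# this PR. The docs path filter, secret names, and script path are assumptions.
name: Algolia crawler reindex
on:
  push:
    branches: [main]
    paths:
      - "docs/**"           # assumed location of the published docs
  workflow_dispatch:         # manual entry point
  workflow_call:             # reusable entry point for other workflows
concurrency:
  group: algolia-reindex
  cancel-in-progress: false  # let an in-flight reindex finish
jobs:
  reindex:
    runs-on: ubuntu-latest
    steps:
      - name: Trigger crawler reindex
        env:
          ALGOLIA_CRAWLER_USER_ID: ${{ secrets.ALGOLIA_CRAWLER_USER_ID }}
          ALGOLIA_CRAWLER_API_KEY: ${{ secrets.ALGOLIA_CRAWLER_API_KEY }}
        run: |
          # curl + jq calls against crawler.algolia.com go here
          ./scripts/algolia-reindex.sh   # assumed helper script name
```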
📝 Walkthrough

The changes introduce custom GitHub Actions workflow automation for Algolia search index reindexing. The workflow replaces a marketplace action with explicit API calls to Algolia's crawler API, adds automatic triggering on documentation changes, implements concurrency controls, and includes polling logic to verify reindex completion.
Sequence Diagram

```mermaid
sequenceDiagram
    participant GitHub as GitHub Actions
    participant API as Algolia Crawler API
    participant Poll as Polling Loop
    GitHub->>API: GET /user_configs (resolve crawler ID)
    API-->>GitHub: Return crawler ID & status
    GitHub->>API: POST /reindex (trigger reindex)
    API-->>GitHub: Return action_id
    GitHub->>Poll: Start polling (up to 5 attempts)
    Poll->>API: GET crawler status
    API-->>Poll: Check reindexing flag
    alt reindexing=true
        Poll-->>GitHub: Success ✓
    else reindexing≠true
        Poll-->>GitHub: Retry or Fail
    end
```
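The resolve-then-trigger half of the diagram can be sketched in shell. Everything below is reconstructed for illustration: the endpoint paths, the `.data[]` response shape, and the injected HTTP wrapper are assumptions based on the diagram, not the PR's actual script.

```shell
#!/bin/sh
# Sketch of the resolve + trigger steps. Endpoint paths and the JSON shape
# (.data[] entries with name/id) are assumptions from the diagram above; the
# HTTP call is injected as $1 so the flow can be exercised without network.
resolve_and_trigger() {
  http=$1          # command taking: METHOD URL (a curl wrapper in the real workflow)
  crawler_name=$2
  response=$("$http" GET "https://crawler.algolia.com/api/1/user_configs") || return 1
  crawler_id=$(printf '%s' "$response" | jq -er --arg n "$crawler_name" \
    '[.data[] | select(.name == $n)] | first | .id') || return 1
  "$http" POST "https://crawler.algolia.com/api/1/crawlers/${crawler_id}/reindex"
}
```

In the real workflow, `$http` would wrap something like `curl -fsS -u "$CRAWLER_USER_ID:$CRAWLER_API_KEY"` using the crawler-specific credentials mentioned above.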
Estimated Code Review Effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed

❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
Deploying with

| Status | Name | Latest Commit | Preview URL | Updated (UTC) |
|---|---|---|---|---|
| ✅ Deployment successful! View logs | unraid-docs | 2e87826 | Commit Preview URL · Branch Preview URL | Mar 24 2026, 06:23 PM |
🧹 Nitpick comments (2)
.github/workflows/algolia-reindex.yml (2)
**57-62: Consider edge case: multiple crawlers matching the name.**

If multiple crawlers share the same name, `jq` will output multiple lines, causing the variable assignment to capture only the first or behave unexpectedly. While unlikely for this use case, adding `| first` or `limit(1; ...)` would make the extraction defensive.

♻️ Optional defensive fix

```diff
 crawler_id="$(
   jq -er \
     --arg crawler_name "${ALGOLIA_CRAWLER_NAME}" \
-    '.data[] | select(.name == $crawler_name) | .id' \
+    '[.data[] | select(.name == $crawler_name)] | first | .id' \
     <<<"${response}"
 )"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/algolia-reindex.yml around lines 57 - 62, The jq extraction for crawler_id can return multiple matches if more than one crawler shares ALGOLIA_CRAWLER_NAME; update the jq filter used in the pipeline where crawler_id is assigned to defensively take only the first match (e.g. use limit(1; .data[] | select(.name == $crawler_name) | .id) or pipe the stream to first) so the assignment to crawler_id always captures a single id even when multiple crawlers match.
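The defensive variant can be checked locally against a sample payload. The payload and crawler name below are made up for illustration; only the jq filter mirrors the suggested fix.

```shell
#!/bin/sh
# Two crawlers share the name "unraid" in this made-up payload; the bracketed
# filter with `first` still yields exactly one id (the first match).
payload='{"data":[{"name":"unraid","id":"a1"},{"name":"unraid","id":"b2"}]}'
crawler_id=$(printf '%s' "$payload" | jq -er \
  --arg crawler_name "unraid" \
  '[.data[] | select(.name == $crawler_name)] | first | .id')
echo "$crawler_id"   # a single id, even with duplicate names
```

By contrast, the original stream-style filter `.data[] | select(.name == $crawler_name) | .id` would emit one line per match.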
**100-137: Consider using `action_id` for precise verification.**

The step captures `action_id` in the previous step but doesn't use it here. While checking `.reindexing == true` works, using the action-specific endpoint could provide more precise status tracking. However, the current approach is functional and avoids additional API complexity.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/algolia-reindex.yml around lines 100 - 137, The step polls the crawler by matching .name and checking .reindexing, but you captured an action_id earlier and should use it for precise verification; update the curl request or jq filter in this step to query the action-specific endpoint or include the action_id parameter (use the captured variable name ACTION_ID or whatever the previous step outputs) and change the jq selector from '.data[] | select(.name == $crawler_name) | .reindexing' to select the entry by action_id (e.g., select(.action_id == $action_id) | .reindexing) and similarly for .status, so the loop directly inspects the specific reindex action rather than matching by crawler name. Ensure the env var is passed into the step and keep the existing retry/sleep logic.
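The polling step this comment discusses can be sketched as a small retry helper. The function name and the injected-command pattern are illustrative, not the workflow's actual code; injecting the status command keeps the loop itself exercisable without network access.

```shell
#!/bin/sh
# Hypothetical retry helper mirroring the workflow's polling step: run a status
# command (curl + jq in the real workflow) up to N times until it prints "true".
poll_reindexing() {
  attempts=$1; shift
  delay=$1; shift
  i=1
  while [ "$i" -le "$attempts" ]; do
    if [ "$("$@")" = "true" ]; then
      echo "crawler entered reindexing state"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "crawler never reported reindexing=true" >&2
  return 1
}
```

In the workflow, the injected command would be something along the lines of a `curl ... | jq '.reindexing'` call against the crawler status endpoint, with a real delay between attempts.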
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: e91267b4-2fed-4fb8-8842-c5c3f1ec849b
📒 Files selected for processing (2)
- .github/workflows/algolia-reindex.yml
- README.md
Summary
Validation
- `.github/workflows/algolia-reindex.yml` validated successfully with Ruby YAML

Notes

- App ID `JUYLFQHE7W`, crawler `unraid`; keeps `workflow_dispatch` and `workflow_call` support while adding automatic runs for content changes on `main`

Summary by CodeRabbit
Documentation
Chores